Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks
High-cardinality categorical features are a major challenge for machine learning methods in general and for deep learning in particular. Existing solutions such as one-hot encoding and entity embeddings can be hard to scale when the cardinality is very high, require substantial space, are hard to interpret, or may overfit the data. A special scenario of interest is that of repeated measures, where the categorical feature is the identity of the individual or object, and each object is measured several times, possibly under different conditions (values of the other features). We propose accounting for high-cardinality categorical features as random effects variables in a regression setting, and consequently adopt the corresponding negative log-likelihood loss from the linear mixed models (LMM) statistical literature and integrate it into a deep learning framework. We test our model, which we call LMMNN, on simulated as well as real datasets with a single high-cardinality categorical feature, using various baseline neural network architectures such as convolutional networks and LSTMs, and various applications in e-commerce, healthcare and computer vision. Our results show that treating high-cardinality categorical features as random effects leads to a significant improvement in prediction performance compared to state-of-the-art alternatives. Potential extensions such as accounting for multiple categorical features and classification settings are discussed. Our code and simulations are available at https://github.com/gsimchoni/lmmnn.
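As a concrete illustration of the loss in question, here is a minimal NumPy sketch (not the paper's Keras implementation) of the marginal Gaussian negative log-likelihood for a single categorical random effect; `sig2b` and `sig2e` stand for the variance components of the random effect and the residual noise:

```python
import numpy as np

def lmm_nll(y, f_X, Z, sig2e, sig2b):
    """Marginal Gaussian NLL of y = f(X) + Z b + e,
    with b ~ N(0, sig2b * I) and e ~ N(0, sig2e * I).
    Marginally, y ~ N(f(X), V) with V = sig2e * I + sig2b * Z Z'."""
    n = len(y)
    V = sig2e * np.eye(n) + sig2b * Z @ Z.T   # marginal covariance of y
    resid = y - f_X
    _, logdet = np.linalg.slogdet(V)          # stable log-determinant
    return 0.5 * (logdet + resid @ np.linalg.solve(V, resid) + n * np.log(2 * np.pi))

# toy data: q = 3 categories, 2 repeated measures each
rng = np.random.default_rng(0)
Z = np.repeat(np.eye(3), 2, axis=0)           # one-hot category membership
y = Z @ rng.normal(size=3) + rng.normal(scale=0.5, size=6)
nll = lmm_nll(y, np.zeros(6), Z, sig2e=0.25, sig2b=1.0)
```

In LMMNN the fixed part `f_X` is the output of the network and the variance components are learned jointly with the network weights.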
MMbeddings: Parameter-Efficient, Low-Overfitting Probabilistic Embeddings Inspired by Nonlinear Mixed Models
Simchoni, Giora, Rosset, Saharon
We present MMbeddings, a probabilistic embedding approach that reinterprets categorical embeddings through the lens of nonlinear mixed models, effectively bridging classical statistical theory with modern deep learning. By treating embeddings as latent random effects within a variational autoencoder framework, our method substantially decreases the number of parameters -- from the conventional embedding approach of cardinality $\times$ embedding dimension, which quickly becomes infeasible with large cardinalities, to a significantly smaller, cardinality-independent number determined primarily by the encoder architecture. This reduction dramatically mitigates overfitting and computational burden in high-cardinality settings. Extensive experiments on simulated and real datasets, encompassing collaborative filtering and tabular regression tasks using varied architectures, demonstrate that MMbeddings consistently outperforms traditional embeddings, underscoring its potential across diverse machine learning applications.
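The parameter-count argument can be made concrete with a quick back-of-the-envelope sketch; the summary size, hidden width, and one-hidden-layer encoder below are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

# conventional embedding table: parameters grow linearly with cardinality
cardinality, d = 1_000_000, 32
table_params = cardinality * d                      # 32M parameters

# encoder-style alternative (illustrative sizes): a fixed-size per-category
# input of s features is mapped through one hidden layer to the mean and
# log-variance of a d-dimensional latent embedding
s, hidden = 16, 64
encoder_params = (s * hidden + hidden) + 2 * (hidden * d + d)

# sampling an embedding with the reparameterization trick
rng = np.random.default_rng(1)
mu, logvar = rng.normal(size=d), rng.normal(size=d)
z = mu + np.exp(0.5 * logvar) * rng.normal(size=d)
```

The encoder's parameter count depends only on its own widths, not on the cardinality, which is the source of the claimed savings.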
Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks
Figure 2: Real data predicted vs. true results and category size distribution.

Implementation: Python 3.8 with the NumPy + Pandas suite, Keras and TensorFlow. Code is fully available in the lmmnn package on GitHub; for running instructions see the package README.

Simulations: n = 100,000. At each run 80% of the simulated data (80,000 observations) is used as the training set, of which 10% (8,000) serves as a validation set used only for early stopping. The baseline embedding layer maps q levels to a d = 0.1q vector, so the input dimension is p + d.

Physical activity (PA) definition: subjects wore an accelerometer on their wrist for 7 days; ENMO in milli-g was summarised across valid wear-time. ETL follows the instructions of Pearce et al. (2020), implemented in R: at a high level, frequency answers are coded numerically, e.g. "once a week" is converted to 1 and "every day" to 7. Finally, the PA dependent variable is standardized.

Baseline DNN architecture: Pearce et al. did not use DNNs but two separate linear regressions, for men and women. Our baseline DNN uses two hidden layers with ReLU activation of 10 and 5 neurons, followed by a single output neuron with no activation.
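The split and the embedding-dimension rule described above can be sketched as:

```python
import numpy as np

# 80% train / 20% test, with 10% of the training set held out for early stopping
n = 100_000
idx = np.random.default_rng(42).permutation(n)
test, train = idx[:20_000], idx[20_000:]
val, train = train[:8_000], train[8_000:]

# entity-embedding baseline: q category levels -> d = 0.1 * q dimensions,
# so the network input dimension becomes p + d (p is an illustrative value)
q, p = 1_000, 10
d = int(0.1 * q)
input_dim = p + d
```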
Integrating Random Effects in Variational Autoencoders for Dimensionality Reduction of Correlated Data
Simchoni, Giora, Rosset, Saharon
Variational Autoencoders (VAEs) are widely used for dimensionality reduction of large-scale tabular and image datasets, under the assumption of independence between data observations. In practice, however, datasets are often correlated, with typical sources of correlation including spatial, temporal and clustering structures. Inspired by the literature on linear mixed models (LMM), we propose LMMVAE -- a novel model which separates the classic VAE latent model into fixed and random parts. While the fixed part assumes the latent variables are independent as usual, the random part consists of latent variables which are correlated between similar clusters in the data, such as nearby locations or successive measurements. The classic VAE architecture and loss are modified accordingly. LMMVAE is shown to significantly improve squared reconstruction error and negative log-likelihood loss on unseen data, with simulated as well as real datasets from various applications and correlation scenarios. It also shows improvement in the performance of downstream tasks such as supervised classification on the learned representations.
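A schematic of the fixed/random latent split, in plain NumPy: the cluster random effect is simply added to each observation's independent latent code (the actual LMMVAE encoder, decoder, and loss modifications are in the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
n_clusters, n_per, d = 4, 5, 3
cluster = np.repeat(np.arange(n_clusters), n_per)   # cluster id per observation

z_fixed = rng.normal(size=(n_clusters * n_per, d))  # independent, as in a classic VAE
b = rng.normal(scale=1.5, size=(n_clusters, d))     # shared within each cluster
z = z_fixed + b[cluster]                            # cluster-correlated latent codes
```

Observations in the same cluster share the offset `b`, so their latent codes are correlated while the `z_fixed` part stays independent.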
Subject-specific Deep Neural Networks for Count Data with High-cardinality Categorical Features
Lee, Hangbin, Ha, Il Do, Hwang, Changha, Lee, Youngjo
Deep neural networks (DNNs), which have been proposed to capture the nonlinear relationship between input and output variables (LeCun et al., 2015; Goodfellow et al., 2016), provide outstanding marginal predictions for independent outputs. However, in practical applications, it is common to encounter correlated data with high-cardinality categorical features, which can pose challenges for DNNs. While the traditional DNN framework overlooks such correlation, random effect models have emerged in statistics to make subject-specific predictions for correlated data. Lee and Nelder (1996) proposed hierarchical generalized linear models (HGLMs), which allow the incorporation of random effects from an arbitrary conjugate distribution of generalized linear model (GLM) family. Both DNNs and random effect models have been successful in improving prediction accuracy of linear models but in different ways. Recently, there has been a rising interest in combining these two extensions. Simchoni and Rosset (2021, 2023) proposed the linear mixed model neural network for continuous (Gaussian) outputs with Gaussian random effects, which allow explicit expressions for likelihoods. Lee and Lee (2023) introduced the hierarchical likelihood (h-likelihood) approach, as an extension of classical likelihood for Gaussian outputs, which provides an efficient likelihood-based procedure. For non-Gaussian (discrete) outputs, Tran et al. (2020) proposed a Bayesian approach for DNNs with normal random effects using the variational approximation method (Bishop and Nasrabadi, 2006; Blei
Machine Learning with High-Cardinality Categorical Features in Actuarial Applications
Avanzi, Benjamin, Taylor, Greg, Wang, Melantha, Wong, Bernard
High-cardinality categorical features are pervasive in actuarial data (e.g. occupation in commercial property insurance). Standard categorical encoding methods like one-hot encoding are inadequate in these settings. In this work, we present a novel _Generalised Linear Mixed Model Neural Network_ ("GLMMNet") approach to the modelling of high-cardinality categorical features. The GLMMNet integrates a generalised linear mixed model in a deep learning framework, offering the predictive power of neural networks and the transparency of random effects estimates, the latter of which cannot be obtained from the entity embedding models. Further, its flexibility to deal with any distribution in the exponential dispersion (ED) family makes it widely applicable to many actuarial contexts and beyond. We illustrate and compare the GLMMNet against existing approaches in a range of simulation experiments as well as in a real-life insurance case study. Notably, we find that the GLMMNet often outperforms or at least performs comparably with an entity embedded neural network, while providing the additional benefit of transparency, which is particularly valuable in practical applications. Importantly, while our model was motivated by actuarial applications, it can have wider applicability. The GLMMNet would suit any applications that involve high-cardinality categorical variables and where the response cannot be sufficiently modelled by a Gaussian distribution.
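Schematically (a hypothetical sketch, not the GLMMNet implementation), the model keeps the GLMM structure g(E[y | b]) = f(X) + Z b while replacing the linear fixed part with a network output; here we use a log link, as in the gamma or Poisson members of the ED family:

```python
import numpy as np

def conditional_mean(f_X, Z, b):
    """g(E[y | b]) = f(X) + Z b with a log link;
    f_X plays the role of the network's fixed-effect output."""
    return np.exp(f_X + Z @ b)

Z = np.repeat(np.eye(2), 2, axis=0)      # 2 occupation levels, 2 policies each
f_X = np.array([0.1, 0.2, 0.1, 0.3])     # fixed (network) part
b = np.array([0.5, -0.5])                # per-level random intercepts
mu = conditional_mean(f_X, Z, b)
```

The estimated `b` is exactly the transparent per-level random-effect estimate that entity embeddings do not provide.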
Integrating Random Effects in Deep Neural Networks
Simchoni, Giora, Rosset, Saharon
Modern approaches to supervised learning like deep neural networks (DNNs) typically implicitly assume that observed responses are statistically independent. In contrast, correlated data are prevalent in real-life large-scale applications, with typical sources of correlation including spatial, temporal and clustering structures. These correlations are either ignored by DNNs, or ad-hoc solutions are developed for specific use cases. We propose to use the mixed models framework to handle correlated data in DNNs. By treating the effects underlying the correlation structure as random effects, mixed models are able to avoid overfitted parameter estimates and ultimately yield better predictive performance. The key to combining mixed models and DNNs is using the Gaussian negative log-likelihood (NLL) as a natural loss function that is minimized with DNN machinery including stochastic gradient descent (SGD). Since NLL does not decompose like standard DNN loss functions, the use of SGD with NLL presents some theoretical and implementation challenges, which we address. Our approach which we call LMMNN is demonstrated to improve performance over natural competitors in various correlation scenarios on diverse simulated and real datasets. Our focus is on a regression setting and tabular datasets, but we also show some results for classification. Our code is available at https://github.com/gsimchoni/lmmnn.
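The non-decomposition issue can be seen directly: batching whole clusters leaves the Gaussian NLL unchanged, while splitting a cluster's observations across mini-batches changes it (a NumPy illustration under a simple random-intercept covariance, with a zero mean for brevity):

```python
import numpy as np

def gauss_nll(y, Z, sig2e, sig2b):
    """NLL of y ~ N(0, V) with V = sig2e * I + sig2b * Z Z'."""
    n = len(y)
    V = sig2e * np.eye(n) + sig2b * Z @ Z.T
    _, logdet = np.linalg.slogdet(V)
    return 0.5 * (logdet + y @ np.linalg.solve(V, y) + n * np.log(2 * np.pi))

Z = np.repeat(np.eye(2), 2, axis=0)          # 2 clusters, 2 observations each
y = np.array([1.0, 0.8, -0.5, -0.7])

full = gauss_nll(y, Z, 1.0, 1.0)
# batches that keep clusters whole: V is block-diagonal, the loss decomposes
by_cluster = gauss_nll(y[:2], Z[:2], 1.0, 1.0) + gauss_nll(y[2:], Z[2:], 1.0, 1.0)
# batches that split clusters: within-cluster covariance terms are lost
mixed = gauss_nll(y[::2], Z[::2], 1.0, 1.0) + gauss_nll(y[1::2], Z[1::2], 1.0, 1.0)
```

This is why naive SGD mini-batching is not innocuous here, and why the paper devotes attention to the theoretical and implementation consequences.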